ARROW-534 - WIP #345

Closed
tebeka wants to merge 2 commits into apache:master from tebeka:integration

Conversation

@tebeka tebeka commented Feb 20, 2017

Contributor

Just for code review, not final code.

Comment thread cpp/src/arrow/ipc/json-internal.cc Outdated

Member

Timestamps before 1970 are negative, so: yes.

Contributor Author

OK :)
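
To illustrate the point about pre-1970 timestamps, here is a minimal sketch, assuming timestamps are stored as a signed 64-bit count of seconds since the Unix epoch. The helper names `ReadAsUnsigned` and `IsBeforeEpoch` are hypothetical and are not part of Arrow or rapidjson.

```cpp
#include <cstdint>

// One second before 1970-01-01T00:00:00Z, i.e. 1969-12-31T23:59:59Z.
constexpr int64_t kOneSecondBeforeEpoch = -1;

// Hypothetical helper: what a reader would see if it fetched the same value
// through an unsigned 64-bit accessor instead of a signed one.
inline uint64_t ReadAsUnsigned(int64_t timestamp) {
  return static_cast<uint64_t>(timestamp);  // wraps modulo 2^64
}

// Hypothetical helper: true when the instant lies before the Unix epoch.
inline bool IsBeforeEpoch(int64_t timestamp) { return timestamp < 0; }
```

Read through an unsigned accessor, the instant one second before the epoch turns into the largest 64-bit value, which is why the signed accessor is the safer choice here.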

Comment thread integration/data/simple.json Outdated

Member

Can you also add data generator fixtures to integration_test.py?

Contributor Author

OK

wesm commented Feb 23, 2017

Member

This will have some minor rebase conflicts with #347

wesm commented Feb 24, 2017

Member

Small rebase required

@wesm wesm left a comment

Member

You need to add date and time test cases for https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/ipc-adapter-test.cc#L178 and https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/ipc-file-test.cc#L183.

See e.g. https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/test-common.h#L283

Integration tests are blocked until ipc-file-test and ipc-adapter-test are passing; then we can add the data generators to integration_test.py. We'll also need to wait for ARROW-582 to get done.

Comment thread cpp/src/arrow/ipc/json-internal.cc Outdated

Member

How does rapidjson handle integers over 2^53? If it doesn't have any problems then this is OK

@tebeka tebeka Mar 1, 2017

Contributor Author

Seems so; the following code works:

#include <cstdint>
#include <iostream>
#include <sstream>
#include "rapidjson/document.h"
#include "rapidjson/istreamwrapper.h"

using namespace rapidjson;

int main() {
    // Build a small JSON document whose "x" value is well above 2^53.
    std::stringstream in;
    in << "{";
    // 2^62 + 13 = 4611686018427387917
    in << "\"x\": " << ((uint64_t(1) << 62) + 13) << ",";
    in << "\"y\": 2";
    in << "}";
    IStreamWrapper isw(in);
    Document doc;

    doc.ParseStream(isw);
    // If no precision is lost, this prints the exact value written above.
    int64_t x = doc["x"].GetInt64();
    std::cout << "x = " << x << std::endl;
    int64_t y = doc["y"].GetInt64();
    std::cout << "y = " << y << std::endl;
    return 0;
}

Member

Good to know. Suppose we'll have the proof in the working integration tests anyway
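
For context on why 2^53 is the threshold being asked about: a parser that stored every JSON number as a `double` would silently round integers above that value. The following is a minimal sketch of that failure mode using only the standard library; `ThroughDouble` is a hypothetical helper, and nothing here describes how rapidjson stores numbers internally.

```cpp
#include <cstdint>

// 2^62 + 13, the same value used in the test program above.
constexpr int64_t kBig = (int64_t{1} << 62) + 13;

// Hypothetical helper: round-trips an integer through a double, as a parser
// that kept every number in a double would do.
inline int64_t ThroughDouble(int64_t value) {
  return static_cast<int64_t>(static_cast<double>(value));
}
```

Under IEEE 754 double precision, `ThroughDouble(kBig)` comes back as 2^62 with the trailing 13 lost, whereas values up to 2^53 survive unchanged. The test program above getting the exact value back from `GetInt64()` suggests the integer path does not take that detour.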

Comment thread cpp/src/arrow/ipc/json-internal.cc Outdated

Member

Timestamps are always signed

Contributor Author

OK, fixing

Comment thread cpp/src/arrow/ipc/json-internal.cc Outdated

Member

Can you combine the logic for TimeType and TimestampType into a single function?

Contributor Author

Sure. Will do.

Comment thread cpp/src/arrow/ipc/json-internal.cc Outdated

Member

Can you use the same parsing code as for integers?

Contributor Author

Good idea, will try to add is_base_of to the ReadArray method for PrimitiveCType and BooleanType.
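
A rough sketch of what the `is_base_of` approach might look like, assuming one template overload handles every type that parses as a plain integer and a second handles booleans. All type declarations below are simplified stand-ins, and `ParsesAsInteger` and `ReadValues` are hypothetical names; this is not the actual `ReadArray` code in json-internal.cc.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Simplified stand-ins for the Arrow type classes (not the real declarations).
struct DataType {};
struct Integer : DataType {};
struct Int64Type : Integer { using c_type = int64_t; };
struct TimestampType : DataType { using c_type = int64_t; };
struct TimeType : DataType { using c_type = int64_t; };
struct BooleanType : DataType { using c_type = bool; };

// Hypothetical trait: true for every type whose JSON values are plain integers.
template <typename T>
struct ParsesAsInteger
    : std::integral_constant<bool, std::is_base_of<Integer, T>::value ||
                                       std::is_base_of<TimestampType, T>::value ||
                                       std::is_base_of<TimeType, T>::value> {};

// Shared path for integers, timestamps and times.
template <typename T>
typename std::enable_if<ParsesAsInteger<T>::value,
                        std::vector<typename T::c_type>>::type
ReadValues(const std::vector<int64_t>& json_numbers) {
  std::vector<typename T::c_type> out;
  for (int64_t v : json_numbers) {
    out.push_back(static_cast<typename T::c_type>(v));
  }
  return out;
}

// Separate path for booleans.
template <typename T>
typename std::enable_if<std::is_base_of<BooleanType, T>::value,
                        std::vector<bool>>::type
ReadValues(const std::vector<int64_t>& json_numbers) {
  std::vector<bool> out;
  for (int64_t v : json_numbers) {
    out.push_back(v != 0);
  }
  return out;
}
```

With this shape, `ReadValues<TimestampType>(raw)` and `ReadValues<Int64Type>(raw)` resolve to the same overload, so the time types reuse the integer parsing code instead of duplicating it.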

Comment thread cpp/src/arrow/type.cc Outdated

Member

This can be deleted

Comment thread cpp/src/arrow/type.h Outdated

Member

TimeUnit is a strongly typed enum, so these MIN/MAX fields aren't necessary. In Flatbuffers, the MIN/MAX values are there to help account for NULL.
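
As a small illustration of the strongly typed enum point, here is a sketch assuming a scoped enum with four units. The enumerator list and the `UnitName` helper are illustrative and may not match the declaration in type.h.

```cpp
#include <string>

// Assumed shape of the enum; the enumerators are illustrative.
enum class TimeUnit : char { SECOND, MILLI, MICRO, NANO };

// Hypothetical helper: maps each unit to a display name. No MIN/MAX sentinel
// is needed because a scoped enum does not convert implicitly from integers,
// so callers can only pass one of the named enumerators without a cast.
inline std::string UnitName(TimeUnit unit) {
  switch (unit) {
    case TimeUnit::SECOND:
      return "SECOND";
    case TimeUnit::MILLI:
      return "MILLISECOND";
    case TimeUnit::MICRO:
      return "MICROSECOND";
    case TimeUnit::NANO:
      return "NANOSECOND";
  }
  return "";  // only reachable if a caller forces an out-of-range cast
}
```

A call such as `UnitName(2)` does not compile, which is the kind of check the MIN/MAX fields would otherwise have to provide at run time.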

Comment thread cpp/src/arrow/ipc/metadata-internal.cc Outdated

Member

If unit is an invalid value, I don't believe it will have made it this far

Contributor Author

Where will we fail? In the flatbuffer side of things?

Member

Right, i.e. it should not be possible to return an invalid TimeUnit from a flatbuffer

wesm commented Mar 1, 2017

Member

wesm commented Mar 12, 2017

Member

@tebeka I opened https://issues.apache.org/jira/browse/ARROW-620 -- since you started on the JSON support in this patch, if you have time can you take this up in a new patch?

tebeka commented Mar 13, 2017

Contributor Author

@wesm OK, will look into ARROW-620

jeffknupp pushed a commit to jeffknupp/arrow that referenced this pull request Mar 15, 2017
Closes apache#345. I had mostly done this in apache#361 so this adds tests to `ipc-adapter-test`

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes apache#371 from wesm/ARROW-534 and squashes the following commits:

cab6d4f [Wes McKinney] Add functions to make record batches for date, date32, timestamp, time. Fix bugs